
169 research outputs found

FunTree: a resource for exploring the functional evolution of structurally defined enzyme superfamilies

Author: A. L. Cuff
Bartlett
Bashton
Brown
C. A. Orengo
Furnham
G. L. Holliday
Glasner
I. Sillitoe
J. M. Thornton
Kanehisa
N. Furnham
Orengo
Porter
R. A. Laskowski
Rentzsch
S. A. Rahman
Schnoes
Shi
Todd
Valdar
Publication venue: Oxford University Press
Publication date: 17/10/2011

FunTree is a new resource that brings together sequence, structure, phylogenetic, chemical and mechanistic information for structurally defined enzyme superfamilies. Gathering this range of data into a single resource allows the investigation of how novel enzyme functions have evolved within a structurally defined superfamily, and provides a means to analyse trends across many superfamilies. This is done not only within the context of an enzyme's sequence and structure but also in terms of the relationships between the reactions catalysed. Developed in tandem with the CATH database, it currently comprises 276 superfamilies covering ∼1800 (70%) of sequence-assigned enzyme reactions. Central to the resource are phylogenetic trees generated from structurally informed multiple sequence alignments, using both domain structural alignments supplemented with domain sequences and whole-sequence alignments based on commonality of multi-domain architectures. These trees are decorated with functional annotations such as metabolite similarity, as well as annotations from manually curated resources such as the Catalytic Site Atlas and MACiE for enzyme mechanisms. The resource is freely available through a web interface: www.ebi.ac.uk/thorton-srv/databases/FunTree

Crossref

LSHTM Research Online

PubMed Central

Composite structural motifs of binding sites for delineating biological functions of proteins

Author: A Bairoch
A Fiorillo
A Rausell
A Stark
AC Joerger
AC Wallace
AG Murzin
Akira R. Kinjo
AM Schnoes
AR Kinjo
B Bollobás
B Dasgupta
B Louie
B Rost
BH Dessailly
C Branden
C Winter
CV Robinson
D Petrey
DJ Schuller
DM Chipman
E Krissinel
E Toyota
FP Davis
GM Santos
H Berman
H Kettenberger
Haruki Nakamura
I Friedberg
J Janin
J Shi
J Westbrook
JI Yeh
K Chen
K Henrick
K Kinoshita
K Okazaki
K Stenberg
L Xie
M Bashton
M Brylinski
M Kitayner
M Levitt
M Moertl
M Nardini
M Tyagi
M Yang
N Nagano
N Tuncbag
N Zhao
ND Gold
O Keskin
OC Redfern
Ozlem Keskin
P Cramer
P Shannon
PD Pawelek
R Koike
R Rentzsch
R Sinha
RR Thangudu
S Kadono
SF Altschul
T Amemiya
T Kawabata
TA Holland
TC Terwilliger
Y Loewenstein
Z Aung
ZX Xia
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2011

Most biological processes are described as a series of interactions between proteins and other molecules, and interactions are in turn described in terms of atomic structures. To annotate protein functions as sets of interaction states at atomic resolution, and thereby to better understand the relation between protein interactions and biological functions, we conducted exhaustive all-against-all atomic structure comparisons of all known binding sites for ligands including small molecules, proteins and nucleic acids, and identified recurring elementary motifs. By integrating the elementary motifs associated with each subunit, we defined composite motifs which represent context-dependent combinations of elementary motifs. It is demonstrated that function similarity can be better inferred from composite motif similarity compared to the similarity of protein sequences or of individual binding sites. By integrating the composite motifs associated with each protein function, we define meta-composite motifs each of which is regarded as a time-independent diagrammatic representation of a biological process. It is shown that meta-composite motifs provide richer annotations of biological processes than sequence clusters. The present results serve as a basis for bridging atomic structures to higher-order biological phenomena by classification and integration of binding site structures.

arXiv.org e-Print Archive

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

The role of viral genomics in understanding COVID-19 outbreaks in long-term care facilities

Author: Aggarwal D
Bashton M
Bharucha T
Bradley DT
Brown CS
Chand M
Connor T
COVID-19 Genomics UK (COG-UK) Consortium
Goodfellow I
Hamilton WL
Meader EJ
Myers R
O'Grady J
Page AJ
Peacock SJ
Robson S
Shallcross L
Smith DL
Tumelty NM
Török ME
Zambon M
Publication date: 29/09/2021

We reviewed all genomic epidemiology studies on COVID-19 in long-term care facilities (LTCFs) that had been published to date. We found that staff and residents were usually infected with identical, or near identical, SARS-CoV-2 genomes. Outbreaks usually involved one predominant cluster, and the same lineages persisted in LTCFs despite infection control measures. Outbreaks were most commonly due to single or few introductions followed by a spread rather than a series of seeding events from the community into LTCFs. The sequencing of samples taken consecutively from the same individuals at the same facilities showed the persistence of the same genome sequence, indicating that the sequencing technique was robust over time. When combined with local epidemiology, genomics allowed probable transmission sources to be better characterised. The transmission between LTCFs was detected in multiple studies. The mortality rate among residents was high in all facilities, regardless of the lineage. Bioinformatics methods were inadequate in a third of the studies reviewed, and reproducing the analyses was difficult because sequencing data were not available in many facilities

UCL Discovery

FLORA: a novel method to predict protein function from structure in diverse superfamilies

Predicting protein function from structure remains an active area of interest, particularly for the structural genomics initiatives where a substantial number of structures are initially solved with little or no functional characterisation. Although global structure comparison methods can be used to transfer functional annotations, the relationship between fold and function is complex, particularly in functionally diverse superfamilies that have evolved through different secondary structure embellishments to a common structural core. The majority of prediction algorithms employ local templates built on known or predicted functional residues. Here, we present a novel method (FLORA) that automatically generates structural motifs associated with different functional sub-families (FSGs) within functionally diverse domain superfamilies. Templates are created purely on the basis of their specificity for a given FSG, and the method makes no prior prediction of functional sites, nor assumes specific physico-chemical properties of residues. FLORA is able to accurately discriminate between homologous domains with different functions and substantially outperforms (a 2–3 fold increase in coverage at low error rates) popular structure comparison methods and a leading function prediction method. We benchmark FLORA on a large data set of enzyme superfamilies from all three major protein classes (α, β, αβ) and demonstrate the functional relevance of the motifs it identifies. We also provide novel predictions of enzymatic activity for a large number of structures solved by the Protein Structure Initiative. Overall, we show that FLORA is able to effectively detect functionally similar protein domain structures by purely using patterns of structural conservation of all residues

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

UCL Discovery

PubMed Central

Spatial growth rate of emerging SARS-CoV-2 lineages in England, September 2020–December 2021

Author: Bashton Matthew
Cliff A. D.
McCann Clare
Nelson Andrew
Smallman-Raynor M. R.
Smith Darren
The (COG-UK) Consortium
Young Greg
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 20/07/2022

This paper uses a robust method of spatial epidemiological analysis to assess the spatial growth rate of multiple lineages of SARS-CoV-2 in the local authority areas of England, September 2020-December 2021. Using the genomic surveillance records of the COVID-19 Genomics UK (COG-UK) Consortium, the analysis identifies a substantial (7.6-fold) difference in the average rate of spatial growth of 37 sample lineages, from the slowest (Delta AY.4.3) to the fastest (Omicron BA.1). Spatial growth of the Omicron (B.1.1.529 and BA) variant was found to be 2.81× faster than the Delta (B.1.617.2 and AY) variant and 3.76× faster than the Alpha (B.1.1.7 and Q) variant. In addition to AY.4.2 (a designated variant under investigation, VUI-21OCT-01), three Delta sublineages (AY.43, AY.98 and AY.120) were found to display a statistically faster rate of spatial growth than the parent lineage and would seem to merit further investigation. We suggest that the monitoring of spatial growth rates is a potentially valuable adjunct to outbreak response procedures for emerging SARS-CoV-2 variants in a defined population

Northumbria Research Link

DODO: an efficient orthologous genes assignment tool based on domain architectures. Domain based ortholog detection

Author: A Kuzniar
C Vogel
CE Storm
CM Zmasek
EV Kriventseva
EW Sayers
F Delsuc
G Ostlund
M Ashburner
M Bashton
M Levitt
M Pellegrini
M Remm
R Jothi
RD Finn
RL Tatusov
RT van der Heijden
Timothy H Wu
Ting-wen Chen
TJ Hubbard
Wailap V Ng
Wen-chang Lin
WM Fitch
Z Fu
Publication venue: BioMed Central
Publication date: 01/10/2010

Background: Orthologs are genes derived from the same ancestral gene locus after speciation events. Orthologous proteins usually have similar sequences and perform comparable biological functions. Therefore, ortholog identification is useful in annotating newly sequenced genomes. With the rapidly increasing number of sequenced genomes, constructing or updating ortholog relationships between all genomes requires considerable effort and computation time. In addition, elucidating ortholog relationships between distantly related genomes is challenging because of the lower sequence similarity. Therefore, an efficient ortholog detection method that can deal with a large number of distantly related genomes is desired. Results: An efficient ortholog detection pipeline, DODO (DOmain based Detection of Orthologs), was created in this study on the basis of domain architectures. Supported by domain composition, which is usually directly related to protein function, DODO can facilitate ortholog detection across distantly related genomes. DODO works in two main steps. Starting from domain information, it first assigns proteins to groups according to their domain architectures and then identifies orthologs within those groups with much reduced complexity. DODO is shown to detect orthologs between two genomes in considerably less time than the traditional reciprocal best hits method, and the advantage is more pronounced when a large number of genomes is analyzed. The output of DODO is highly comparable with other known ortholog databases. Conclusions: DODO provides a new, efficient pipeline for detecting orthologs in a large number of genomes. In addition, a database established with DODO is easier to maintain and can be updated with relatively little effort. The DODO pipeline can be downloaded from http://140.109.42.19:16080/dodo_web/home.htm
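
The two steps named in the abstract above (group proteins by domain architecture, then search for orthologs only inside each group) can be illustrated with a minimal Python sketch. This is not the DODO implementation: the function names, the toy protein identifiers, the domain labels, and the `score` similarity function are all hypothetical, and a reciprocal-best-hit test is assumed here as the within-group criterion purely for illustration.

```python
from collections import defaultdict

def group_by_architecture(domains):
    """Step 1: bucket (genome, protein) pairs by their ordered domain architecture."""
    groups = defaultdict(list)
    for (genome, protein), architecture in domains.items():
        groups[tuple(architecture)].append((genome, protein))
    return groups

def reciprocal_best_hits(members, genome_a, genome_b, score):
    """Step 2 (assumed criterion): reciprocal best hits restricted to one group."""
    a = [p for g, p in members if g == genome_a]
    b = [p for g, p in members if g == genome_b]
    if not a or not b:
        return []
    best_ab = {x: max(b, key=lambda y: score(x, y)) for x in a}
    best_ba = {y: max(a, key=lambda x: score(x, y)) for y in b}
    return [(x, y) for x, y in best_ab.items() if best_ba[y] == x]

def find_orthologs(domains, genome_a, genome_b, score):
    """Run both steps and return sorted candidate ortholog pairs."""
    pairs = []
    for members in group_by_architecture(domains).values():
        pairs.extend(reciprocal_best_hits(members, genome_a, genome_b, score))
    return sorted(pairs)

# Hypothetical toy input: two genomes, made-up domain labels and similarities.
domains = {
    ("A", "a1"): ["D1", "D2"],
    ("A", "a2"): ["D3"],
    ("B", "b1"): ["D1", "D2"],
    ("B", "b2"): ["D1", "D2"],
    ("B", "b3"): ["D3"],
}
similarity = {("a1", "b1"): 0.9, ("a1", "b2"): 0.4, ("a2", "b3"): 0.8}

def score(x, y):
    # Placeholder for a real sequence-similarity score.
    return similarity.get((x, y), 0.0)

orthologs = find_orthologs(domains, "A", "B", score)
```

Because comparisons are made only between proteins that share an architecture, the number of pairwise scores computed is much smaller than in an all-against-all search, which is the kind of complexity reduction the abstract describes.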

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Emergence and maintenance of actionable genetic drivers at medulloblastoma relapse

Author: André N
Bailey S
Bashton M
Clifford SC
Crosier S
Figarella-Branger D
Grabovska Y
Hansford JR
Hicks D
Hill RM
Jacques TS
Jorgensen M
Joshi A
Keeling C
Kui C
Lastowska M
Lindsey JC
Michalski A
Pease L
Pfister SM
Pickles JC
Pizer B
Ramaswamy V
Richardson S
Schwalbe EC
Taylor MD
Vinci M
Wharton SB
Williamson D
Zakrzewski K
Publication date: 01/01/2022

BACKGROUND: 90% of tumors) and established genetic drivers (e.g. SHH/WNT/P53 mutations; 60% of rMB events) were maintained from diagnosis. Critically, acquired and maintained rMB events converged on targetable pathways which were significantly enriched at relapse (e.g. DNA damage-signaling) and specific events (e.g. 3p loss) predicted survival post-relapse. CONCLUSIONS: rMB is defined by the emergence of novel events and pathways, in concert with selective maintenance of established genetic drivers. Together, these define the actionable genetic landscape of rMB and provide a basis for improved clinical management and development of stratified therapeutics, across disease-course

UCL Discovery

Integrative genomic analysis of childhood acute lymphoblastic leukaemia lacking a genetic biomarker in the UKALL2003 clinical trial

Author: Bain R
Barretta E
Bashton M
Bentley DR
Butler E
Cranston RE
Enshaei A
Gibson J
Harrison CJ
Hawking Z
Hinchliffe AC
Kingsbury Z
Moorman AV
Murray J
Peden JF
Ross MT
Russell LJ
Ryan SL
Schwab C
Vora A
Winterman E
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'

Newcastle University E-Prints

CLIMB-COVID: continuous integration supporting decentralised sequencing for SARS-CoV-2 genomic surveillance.

Funder: Wellcome Trust

In response to the ongoing SARS-CoV-2 pandemic in the UK, the COVID-19 Genomics UK (COG-UK) consortium was formed to rapidly sequence SARS-CoV-2 genomes as part of a national-scale genomic surveillance strategy. The network consists of universities, academic institutes, regional sequencing centres and the four UK Public Health Agencies. We describe the development and deployment of CLIMB-COVID, an encompassing digital infrastructure to address the challenge of collecting and integrating both genomic sequencing data and sample-associated metadata produced across the COG-UK network.

University of Liverpool Repository

Northumbria Research Link

Directory of Open Access Journals

Edinburgh Research Explorer

Apollo (Cambridge)

Combinatorial Clustering of Residue Position Subsets Predicts Inhibitor Affinity across the Human Kinome

Author: BH Dessailly
C Fraley
C Schalon
CC Chang
D Huang
D Kuhn
DH Bryant
Drew H. Bryant
DW Kim
ED Scheeff
F Glaser
F Milletti
G Manning
JA Bikker
JW Torrance
K Mizuguchi
L Hu
L Xie
Lydia E. Kavraki
M Ashburner
M Bashton
M Magrane
M Moll
Mark Moll
MJ McGregor
Mona Singh
MW Karaman
N Hulo
P Cohen
P de Matos
P Rousseeuw
Paul W. Finn
R Wang
RD Finn
S Schmitt
SL Kinnings
T Liu
Y Liu
Publication date: 01/01/2012

The protein kinases are a large family of enzymes that play fundamental roles in propagating signals within the cell. Because of the high degree of binding site similarity shared among protein kinases, designing drug compounds with high specificity among the kinases has proven difficult. However, computational approaches to comparing the 3-dimensional geometry and physicochemical properties of key binding site residue positions have been shown to be informative of inhibitor selectivity. The Combinatorial Clustering Of Residue Position Subsets (CCORPS) method, introduced here, provides a semi-supervised learning approach for identifying structural features that are correlated with a given set of annotation labels. Here, CCORPS is applied to the problem of identifying structural features of the kinase ATP binding site that are informative of inhibitor binding. CCORPS is demonstrated to make perfect or near-perfect predictions for the binding affinity profile of 8 of the 38 kinase inhibitors studied, while only having overall poor predictive ability for 1 of the 38 compounds. Additionally, CCORPS is shown to identify shared structural features across phylogenetically diverse groups of kinases that are correlated with binding affinity for particular inhibitors; such instances of structural similarity among phylogenetically diverse kinases are also shown to not be rare among kinases. Finally, these function-specific structural features may serve as potential starting points for the development of highly specific kinase inhibitors

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

DSpace at Rice University

FigShare